Goto

Collaborating Authors

 proximal term



Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack

Neural Information Processing Systems

Recent studies show that Large Language Models (LLMs) with safety alignment can be jail-broken by fine-tuning on a dataset mixed with harmful data. For the first time in the literature, we show that the jail-break effect can be mitigated by separating two states in the fine-tuning stage to respectively optimize over the alignment and user datasets. Unfortunately, our subsequent study shows that this simple Bi-State Optimization (BSO) solution experiences convergence instability when steps invested in its alignment state is too small, leading to downgraded alignment performance. By statistical analysis, we show that the \textit{excess drift} towards the switching iterates of the two states could be a probable reason for the instability. To remedy this issue, we propose \textbf{L}azy(\textbf{i}) \textbf{s}afety \textbf{a}lignment (\textbf{Lisa}), which introduces a proximal term to constraint the drift of each state. Theoretically, the benefit of the proximal term is supported by the convergence analysis, wherein we show that a sufficient large proximal factor is necessary to guarantee Lisa's convergence. Empirically, our results on four downstream fine-tuning tasks show that Lisa with a proximal term can significantly increase alignment performance while maintaining the LLM's accuracy on the user tasks. Code is available at https://github.com/git-disl/Lisa.


FedRef: Communication-Efficient Bayesian Fine-Tuning using a Reference Model

Yoon, Taehwan, Choi, Bongjun, De Neve, Wesley

arXiv.org Artificial Intelligence

Federated learning (FL) collaboratively trains artificial intelligence (AI) models to ensure user data privacy. Sharing only model updates generated from local training on client data with the server enhances user data privacy. However, model performance may suffer due to data and system heterogeneity among clients in FL scenarios. Previous studies have proposed model optimization, fine-tuning, and personalization to achieve improved model performance. Despite these efforts, models resulting from FL scenarios often exhibit catastrophic forgetting, which increases the communication and computational costs of clients for model optimization and raises energy consumption. To address these challenges, we propose a reference model-based fine-tuning method for federated learning that overcomes catastrophic forgetting in each round. Our method is derived from Bayesian parameter-efficient transfer learning and includes an proximal term. It employs a reference model that incorporates previous model parameters and reviews previous global features in the model optimization step to mitigate catastrophic forgetting. As a result, our method achieves higher model performance and lower communication and computational costs for clients than existing methods.




Faster Deep Reinforcement Learning with Slower Online Network

Neural Information Processing Systems

Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping.


Continual Low-Rank Adapters for LLM-based Generative Recommender Systems

Yoo, Hyunsik, Li, Ting-Wei, Kang, SeongKu, Liu, Zhining, Xu, Charlie, Qi, Qilin, Tong, Hanghang

arXiv.org Artificial Intelligence

While large language models (LLMs) achieve strong performance in recommendation, they face challenges in continual learning as users, items, and user preferences evolve over time. Existing LoRA-based continual methods primarily focus on preserving performance on previous tasks, but this overlooks the unique nature of recommendation: the goal is not to predict past preferences, and outdated preferences can even harm performance when current interests shift significantly. To address this, we propose PESO (Proximally rEgularized Single evolving lOra, a continual adaptation method for LoRA in recommendation. PESO introduces a proximal regularizer that anchors the current adapter to its most recent frozen state, enabling the model to flexibly balance adaptation and preservation, and to better capture recent user behaviors. Theoretically, we show that this proximal design provides data-aware, direction-wise guidance in the LoRA subspace. Empirically, PESO consistently outperforms existing LoRA-based continual learning methods.




Review for NeurIPS paper: A Feasible Level Proximal Point Method for Nonconvex Sparse Constrained Optimization

Neural Information Processing Systems

Summary and Contributions: UPDATE AFTER REBUTTAL Thank you for your response. The authors have agreed to clarify the motivation for the use of non-convex constraints and will be precise about the use of the word "suboptimal". They have agreed to state that they have considerably more knowledge on the specific kind of non-convexity they are dealing with when comparing with prior works. As a consequence I have increased my score, although I believe the experimental section remains rather unconvincing. This paper proposes a method based on a sequence of convex approximations to solve optimization problems with a non-convex sparsity constraint.